Fourier-Bessel Based Cepstral Coefficient Features for Text-Independent Speaker Identification

Authors

  • Chetana Prakash
  • Suryakanth V. Gangashetty
Abstract

This paper proposes Fourier-Bessel cepstral coefficients (FBCC) as features for robust text-independent speaker identification. The Fourier-Bessel (FB) expansion is used instead of the Fourier transform to represent the signal in the frequency domain. The FB expansion can be viewed as a two-dimensional Fourier transform. Changing the kernel of the transform from exponentials to decaying exponentials lets the speech signal be viewed as a linear sum of decaying exponentials. For signals arising from acoustic tubes, where the signal is subjected to many damping effects, delays in the different components of the signal are inevitable. Representing such signals with FB coefficients helps identify the different components present in the signal. Because Bessel functions behave like damped sinusoids, they are a more natural basis for the random, non-stationary speech signal, and for voiced speech in particular. The vocal tract, modeled as a set of linear acoustic tubes that are cylindrical in shape, can be modeled efficiently with the FB expansion because Bessel functions are solutions of the cylindrical wave equation. The proposed approach to speaker identification is based on FBCC features and uses Gaussian mixture models to capture speaker characteristics. The speaker models are built from Fourier-Bessel features derived from the speech samples, as an alternative to Mel-frequency cepstral coefficients (MFCC) and linear prediction cepstral coefficients (LPCC). The Gaussian mixture model is evaluated on the TIMIT database, which consists of 630 speakers with 10 utterances per speaker, and on TIMIT speech with white noise at SNRs of 50, 40, 30, and 20 dB.
A statistical model such as the Gaussian mixture model (GMM), together with features extracted from the speech signals, is used to build a unique identity for each person enrolled for speaker identification [1]. The expectation-maximization (EM) algorithm is used to find the maximum-likelihood solution for a model given the features, and later utterances are tested against the database of all enrolled speakers. Experimental results show that FBCC can be used as an alternative feature to LPCC and MFCC, since it can improve the performance of the speaker identification task.
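
The abstract does not spell out the expansion itself. As a rough illustration only, the sketch below assumes the commonly used zeroth-order Fourier-Bessel series on a normalized interval (0, 1), x(t) = sum over m of C_m J0(lambda_m t), where lambda_m are the positive zeros of J0 and C_m = 2 * integral of t x(t) J0(lambda_m t) dt / J1(lambda_m)^2. The function names (`bessel_j`, `j0_zeros`, `fb_coefficients`) are hypothetical, the integral is approximated with a midpoint rule, and the later steps that turn FB coefficients into cepstral features are not shown because the abstract does not describe them.

```python
import math


def bessel_j(order, x, steps=200):
    """Integer-order Bessel function J_n(x) from Bessel's integral,
    J_n(x) = (1/pi) * integral_0^pi cos(n*tau - x*sin(tau)) dtau,
    evaluated with a midpoint rule (adequate for moderate x)."""
    h = math.pi / steps
    total = 0.0
    for k in range(steps):
        tau = (k + 0.5) * h
        total += math.cos(order * tau - x * math.sin(tau))
    return total / steps


def j0_zeros(count):
    """First `count` positive zeros of J0, found by bisection around
    the asymptotic estimate (m - 1/4) * pi."""
    zeros = []
    for m in range(1, count + 1):
        guess = (m - 0.25) * math.pi
        lo, hi = guess - 0.5, guess + 0.5
        f_lo = bessel_j(0, lo)
        for _ in range(50):
            mid = 0.5 * (lo + hi)
            f_mid = bessel_j(0, mid)
            if f_lo * f_mid <= 0:
                hi = mid
            else:
                lo, f_lo = mid, f_mid
        zeros.append(0.5 * (lo + hi))
    return zeros


def fb_coefficients(frame, num_coeffs):
    """Zeroth-order FB coefficients of one frame (assumed formulation).
    Sample i is placed at t = (i + 0.5) / len(frame) in (0, 1)."""
    n = len(frame)
    dt = 1.0 / n
    coeffs = []
    for lam in j0_zeros(num_coeffs):
        integral = 0.0
        for i, sample in enumerate(frame):
            t = (i + 0.5) * dt
            integral += t * sample * bessel_j(0, lam * t) * dt
        coeffs.append(2.0 * integral / bessel_j(1, lam) ** 2)
    return coeffs


if __name__ == "__main__":
    # Synthetic frame built from two Bessel components; the expansion
    # should recover their weights at indices 1 and 4.
    lam = j0_zeros(6)
    size = 128
    frame = [
        3.0 * bessel_j(0, lam[1] * (i + 0.5) / size)
        - 1.5 * bessel_j(0, lam[4] * (i + 0.5) / size)
        for i in range(size)
    ]
    for index, value in enumerate(fb_coefficients(frame, 6)):
        print(index, round(value, 3))
```

The pure-Python Bessel routine is used only to keep the sketch dependency-free; a numerical library would be the practical choice for real speech frames.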

Similar resources

New Filter Structure based on Admissible Wavelet Packet Transform for Text-Independent Speaker Identification

Identical acoustic features such as Mel-frequency cepstral coefficients (MFCC) and linear predictive cepstral coefficients (LPCC) are widely used for different tasks like speech recognition and speaker recognition, even though the requirements of speaker recognition differ from those of speech recognition. In MFCC feature representation, the Mel frequency scale is used to get a high resolutio...

A Framework for Multilingual Text-Independent Speaker Identification System

This article evaluates the performance of Extreme Learning Machine (ELM) and Gaussian Mixture Model (GMM) in the context of text-independent multilingual speaker identification for recorded and synthesized speech. The type and number of filters in the filter bank, the number of samples in each frame of the speech signal, and the fusion of model scores play a vital role in speaker identification accur...

Language and Text-Independent Speaker Identification System Using GMM

This paper motivates the use of the Dynamic Mel-Frequency Cepstral Coefficient (DMFCC) feature and the combination of DMFCC and MFCC features for robust language- and text-independent speaker identification. The MFCC feature, modeled on the human auditory system, has been widely used for speaker recognition because of its lower vulnerability to noise perturbation and its low session variability. Bu...

Speaker Identification Based on Vector Quantization

In this paper, a method of text-independent speaker recognition using discrete vector quantization is presented. The identification experiments were performed on a closed set of 599 speakers, and two types of features were tested: cepstral mean subtraction coefficients and mel-frequency cepstral coefficients. The effect of the codebook size on the speaker identification performanc...

Multiband Approach to Robust Text-independent Speaker Identification

This paper presents an effective method for improving the performance of a speaker identification system. Based on the multiresolution property of the wavelet transform, the input speech signal is decomposed into various frequency bands in order not to spread noise distortions over the entire feature space. To capture the characteristics of the vocal tract, the linear predictive cepstral coeffi...

Publication date: 2011